Information Systems at Pacific Telephone.


In this study, the results of a modeling effort by the MVS
Capacity Planning Group in the Information Systems Organization
(ISO) of Pacific Telephone will be presented. Specifically, a
large CICS online system and a large multi-CPU IMS system were
modeled using the CAPTURE/MVS and BEST/1 packages from BGS
Systems, Inc., of Waltham, Massachusetts.


Currently, there are four data centers in ISO - two in the
northern region of California, and two in the southern. ISO runs
many production IMS online systems, some of which are Centrally
Developed Systems (CDS's) from Bell Labs or AT&T. In fact, in the
northern region there are twelve IMS systems, six of which are
CDS's. The southern region runs five production IMS systems, four
of which are similar to the north's CDS's. In this study, we
focus on one IMS application, which runs as two systems in each
region. It is a CDS. The CICS application runs as a system in
the north, and in the south, both with similar transaction
volumes. This system was developed at Pacific Telephone, and is
therefore not a CDS.


Capacity Planning.


The MVS Systems Capacity Planning Group in ISO was formed years
ago. But, in early 1981, the make-up of the group changed, and it
has remained somewhat similar up to now. During 1981, the group
consisted of four people - a technical manager, an ex-systems
programmer, an ex-performance analyst, and an ex-statistician.
For that year, most of their concerns were with CPU capacity
planning. One year later, a DASD capacity planner was added. He
was an ex-computer operator.


During the period 1981 through 1982, CPU and DASD capacity planning
were done using a technique similar to that used in USAGE, from
IBM. The technique used linear projections. The greatest strides
were made in the area of the materials which were presented to
peer groups and upper management. A "Systems Plan" was created,
which showed workload growth in a graphic, rather than tabular,
format. It was very successful. The technical manager was
promoted, as was the ex-statistician. The ex-systems programmer
moved on to another assignment as a manager, and the
ex-performance analyst now leads the MVS Capacity Planning Group.
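

As a rough illustration of the linear projection technique used
during 1981 and 1982, the sketch below fits a least-squares trend
line to past measurements and extends it to a future month. The
monthly figures and the function names are hypothetical, and the
USAGE technique itself may have differed in detail.

```python
# Minimal sketch of a linear workload projection (least-squares trend line).
# The monthly figures below are hypothetical, not Pacific Telephone data.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(slope, intercept, x):
    """Extrapolate the fitted line to a future period x."""
    return slope * x + intercept


months = [1, 2, 3, 4, 5, 6]                       # hypothetical observation months
cpu_busy = [40.0, 42.0, 44.0, 46.0, 48.0, 50.0]   # hypothetical peak-hour CPU busy, percent

slope, intercept = fit_line(months, cpu_busy)
forecast_month_18 = project(slope, intercept, 18)
print(f"growth: {slope:.1f} points/month, month 18 forecast: {forecast_month_18:.1f}%")
```

A straight line takes no account of queueing effects near
saturation, which is one reason analytic modeling was taken up
later.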


Techniques for capacity planning which relied on queueing theory
were not used during the early years. During those two years,
very little modeling of the production systems was done, because
of where the priorities stood. But, in early 1983 the emphasis
began to shift, and modeling began to take on a significant role.
With reduced life expectancies for the new large systems hardware,
it became increasingly important to improve the accuracy of the
three-year applications hardware forecasts. Analytic modeling,
it was thought, would fit the role. Indeed it has, and the results
of two studies are presented here.


Capacity Planning vs. Performance Management.


The ISO MVS capacity planners are not performance analysts. Some
experience with performance analysis tools has been greatly
beneficial, but the organizational reporting hierarchy has the
planners and performance persons reporting under different lines.
In addition, the planners do not do performance analysis, and
whatever ideas they have for improvements have never carried much
clout. The planners are most concerned with making accurate six
month to three year hardware forecasts. The ISO performance
analysts are concerned about a much shorter time frame.


The CICS application is seven years old, and is a locally
developed system with over a hundred different task types. It is
a very large application, with over 2000 terminals online in each
region during production hours. The application runs on IBM
3081-K models in both regions. The northern system was modeled
during April 1983, at which time it ran on a 16 Meg, 16 channel
system. Seventy-three 3350 DASD were accessed during the modeling
period. Two tape drives were online for logging purposes.


Inherent in any analytic modeling exercise are a number of risks
and assumptions. It is most important that the analyst understand
both items, especially when doing modifications or sensitivity
analyses for projection to future scenarios for the application or
its environment. The possible risks associated with analytic
modeling used for extrapolation into the future are numerous. And,
many involve unexpected changes in the application or its
environment.


Here are the specific assumptions and risks involved with
modeling this CICS application.


Assumptions


A. 1. The CICS tasks (or transactions) can be
clustered into different types. There are over
100 different task types for this application.
Early in the analysis, several variables (task
average CPU time, wait time, I/O time for
various servers, number of terminal responses,
and others) were thought to be needed to break
the CICS tasks into many clusters. SAS cluster
analysis was used. The results were
inconclusive, because no consistent set of clusters
was to be found over different measurement
intervals. Instead, only two clusters were
used, and they were based on wait time, not
resource usage.


Cluster-1 included those tasks which had wait
times less than one second. Cluster-2 included
all other tasks. The CICS application has
interfaces to other applications, and many of
the associated interface tasks have average
wait times greater than a second. All online
tasks have very short wait times. By count,
93% of all tasks were Cluster-1 types.


A. 2. The dispatching priority option in BEST/1 was
used, and it was assumed that Cluster-1 tasks
have higher priority than Cluster-2 types.
Priority modeling operates on a preemptive
resume priority discipline.


A. 3. CICS ran on a 3081-D dyadic processor. It is
assumed that a theoretical limit of one-half
the system's processing time is available for
CICS use, because of CICS's single address space
nature. In the model, two CPU servers were
used - a server which accommodated CICS, and one
which served other workloads. This is the one
heroic assumption in modeling CICS.


A. 4. Memory was not a problem on the CICS 3081-D
system. CICS runs in a fixed six megabyte
region. There is "no" paging. CICS is the
single workload (of any importance) which runs
on the system.


A. 5. In the CICS model, only basic modeling was
done. No I/O subsystem modeling was
attempted. In BEST/1 terminology, this
assumption means that every workload
transaction accesses each of the workload
group's servers. The resulting model was
satisfactory for CPU capacity planning, though
inadequate for any performance analysis
requiring I/O subsystem detail.


Now, the risks involved with modeling this CICS system. 


R. 1. Only a single busy hour of a single day was
modeled. However, this was primarily because
of CICS data set movement over several days.
We just couldn't do any service time averaging
for the DASD, because we didn't know where the
servers were from one day to the next. So,
considerable time was spent looking at SAS
sysouts, to find a good modeling period. It is
felt that we chose a typical measurement
interval.


R. 2. Five workloads were modeled: CICS Cluster-1,
CICS Cluster-2, Test Batch, TSO, and Overhead
(or Other). ISO uses very little TSO, and
ROSCOE is used as the card-image editor. There
was a minor problem with the IPS, as ROSCOE's
performance group was the same as that for the
Overhead workload. ROSCOE should have a
separate PG sometime soon, and then we can
treat it as a separate workload.


R. 3. RMF and PAII measurement intervals were out of
sync by eight minutes, or 13%. Get your
measurement intervals started on the hour or
half-hour!


R. 4. There was high CPU unaccounted time, which is time
that is unassociated with any workload. BGS
Systems tech support says that they have seen
up to 20% CPU unaccounted time figures for 3081
series mainframes at some installations. Here,
we had a 10% figure. How this time was
distributed among the five workloads affected
the accuracy of the model validation of CPU
utilization.


R. 5. It was assumed that homogeneity existed, and all tasks
needed all servers. Clearly, this is not the
case, as Cluster-1 tasks use different servers
than Cluster-2 tasks. We couldn't separate
them, though, so homogeneity is assumed.


The time involved to model the CICS system can be broken into
seven phases. Although the time associated with each phase will
not be the same for every modeling effort, the phases involved
will be similar.


Phases.


P. 1. Choosing an application or system to model.


P. 2. Establishing contacts, and learning about their
data sources. Time involved: three days.


P. 3. Gather preliminary data. One must choose an
appropriate interval to model. Reading through
many SAS sysouts was done during this phase.
Time involved: five days.


P. 4. Run the data extractor (CAPTURE/MVS). Run the
Analyzer against the extracted data. Gather
any other data for the modeling interval. Pore
over the sysouts to be certain of the validity
of the chosen interval. Note that there was
only one interval chosen in this study, so no
averaging over many intervals was done. That
would have lengthened considerably this phase's
time. Time involved: two days.


P. 5. Run the BEST/1 model. Validate the model.
This may involve combining workloads,
distributing CPU unaccounted time, eliminating
servers, etc. Time involved: three days.


P. 6. Run the sensitivity analyses. Vary task rates,
alter the CPU processing speed, add or eliminate
workloads. Analyze the results. Time
involved: four days.


P. 7. Present the results. This will involve either
making a viewgraph presentation or writing a
report. The time involved here is variable,
anywhere from a couple of days to several
weeks. Here, we made a viewgraph presentation.
Time involved: five days.


It is worthwhile to note that the time phases are sequential, and
except for Phase 7, the number of people involved in each phase
has little effect on the expected total time to complete the
phases. For this effort the total time involved was about
twenty-two days, or one working month. It is interesting to note that,
discounting the effort involved in presenting the results, running
the sensitivity analyses involved less than one-quarter of the
time. Building the initial model, and validating it, is where the
significant amounts of time were spent.


Capacity planners involved in modeling need significant help from
friendly co-workers within the company, from vendors, and from
contacts in the industry. In the CICS effort we got considerable
help from the CICS application's planning and programming support
group. Mostly this involved learning about the makeup of the
application tasks, and getting SAS source code to examine the
tasks.


CICS Model Validation.


The CICS model was validated in three areas. First, the CICS
application CPU utilization was validated. Second, the average
internal response time for the average of the Cluster-1 and
Cluster-2 tasks was validated. Last, the maximum task throughput
rate was validated. Table 1 shows model validation results.


TABLE 1.
CICS Model Validation


Validation Variable       Data Source    Figure         Difference

CICS CPU Utilization      QCM            [illegible]
                          Model          [illegible]    4.8%
System CPU Utilization    QCM            [illegible]
                          Model          [illegible]    [illegible]
Average Internal          PAII           0.26 sec
Response Time             Model          0.26 sec       0.0
Estimated Maximum         CICS Group     90K/hour
Task Rate                 Model          89K/hour       [illegible]


As can be seen from the table, system CPU utilization for the
system on which the CICS application was running is in good
agreement with the total obtained from the software monitor QCM,
from Duquesne Systems, Inc. The total unaccounted CPU time was
about 10%, and this is an important figure when validating CPU
utilization for the application of interest. In this case, we
attributed 40% of the unaccounted time to CICS, and the remainder
to an Overhead workload. Generally, validation to within ten
percent of the utilization figure is considered acceptable.
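

The distribution of unaccounted CPU time can be sketched as
follows. Only the 10% unaccounted figure and the 40% CICS share
come from the text. The captured utilizations and the function
name are hypothetical, chosen just to show the arithmetic.

```python
# Sketch: spread unaccounted CPU time over workloads before validation.
# The captured figures are hypothetical; 10% unaccounted and the 40%
# CICS share are the values quoted in the text.

def distribute_unaccounted(captured, unaccounted, shares):
    """Add each workload's share of the unaccounted CPU time (in
    percentage points) to its captured CPU utilization."""
    return {name: captured[name] + unaccounted * shares.get(name, 0.0)
            for name in captured}


captured = {"CICS": 30.0, "Overhead": 5.0}   # hypothetical captured CPU, percent
shares = {"CICS": 0.40, "Overhead": 0.60}    # 40% to CICS, remainder to Overhead

adjusted = distribute_unaccounted(captured, unaccounted=10.0, shares=shares)
for name, value in adjusted.items():
    print(f"{name}: {value:.1f}%")
```

Because the split is a judgment call, a different split moves the
application's modeled utilization by several points, which is why
the unaccounted figure matters so much to validation.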


Table 1 also shows that the average internal response time (R.T.)
for all CICS tasks is in exact agreement with Performance Analyzer
II (PAII) data collected by the CICS applications group. Note
that these figures are not percentile R.T. figures, but
rather averages of the estimated response times for all tasks.
Rule of thumb says that the 95th percentile figures are going to
be approximately three times as long as the average R.T. figures.
A 95th percentile of one second requires an average R.T. of about
0.33 sec. Generally, Service Level Agreements list 90th or 95th
percentiles for R.T. Here, we would expect 95% of all tasks to
have internal response times less than one second. Also, this
CICS application was sitting on the knee of the response time
sensitivity analysis (RTSA) curve. When this is the case, model
validation becomes difficult, so these exact response time
validation results are all the more impressive.
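

The three-to-one rule of thumb is what one would get if response
times were exponentially distributed, since the 95th percentile of
an exponential distribution is ln(20), or about 3.0, times its
mean. The sketch below works through that arithmetic. The
exponential assumption is ours for illustration and is not stated
in the study.

```python
import math

# Sketch: percentile-to-mean ratio under an ASSUMED exponential
# response time distribution, where P(T > t) = exp(-t / mean).

def percentile_multiplier(p):
    """Ratio of the p-th percentile to the mean for an exponential distribution."""
    return -math.log(1.0 - p)


def mean_for_percentile_target(target_seconds, p):
    """Average response time that places the p-th percentile at target_seconds."""
    return target_seconds / percentile_multiplier(p)


m95 = percentile_multiplier(0.95)                       # about 3.0
required_mean = mean_for_percentile_target(1.0, 0.95)   # about 0.33 sec
p95_at_validation = 0.26 * m95                          # 0.26 sec is the validated average

print(f"95th percentile multiplier: {m95:.2f}")
print(f"average needed for a 1 sec 95th percentile: {required_mean:.2f} sec")
print(f"estimated 95th percentile at 0.26 sec average: {p95_at_validation:.2f} sec")
```

Under this assumption the validated average of 0.26 sec puts the
95th percentile below one second, in line with the expectation
stated above.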


Finally, an atypical validation parameter was used as the final
check in the validation process. Long ago, an estimate of the
maximum task rate for the CICS application was made by those close
to the application. They estimated 90,000 tasks per hour on 3081-D
hardware as the maximum throughput rate. Here, the model shows
89,000 as the absolute maximum.


SHARE 61 Presentation, August 1983


After validation, the task throughput rate was increased, such
that the first bottleneck could be found. Assuming that the I/O
subsystem could be tuned (dataset and pack movement), the first
bottleneck which was found was the CPU. In typical modeling analysis
fashion, the CPU "SPEED" parameter was increased. The 3081-D was
upgraded to a 3081-K, by using a SPEED of 1.4 - our performance
factor estimate for a 3081-K is 1.4 times a 3081-D. Assuming that
the one dataset which resides on a 3350 with 35% utilization was
not a problem, the sensitivity analysis could proceed from there.
Other bottlenecks were then found. The CICS model assumed that
all DASD packs were hit by all tasks. In truth, this is simply not
the case. The task mix is important. But, finding individual
task service requirements from all servers is an overbearing
occupation. Capacity planners want improved forecasts, foremost.


An analysis was done changing the hardware to a large uniprocessor,
and those results are presented in the next section. But, for
now, let it be stated that this CICS application wants to run on a
large uni, as is well-known by us and the vendor community.


CICS Modeling Results. 


The results of the CICS modeling experience are summarized in the
four graphs which follow. Figure I.A shows CPU Utilization
Sensitivity Analysis (CUSA). Figure I.B has the Response Time
Sensitivity Analysis (RTSA). Figure I.C shows the I/O Bottleneck
Sensitivity Analysis (IOBSA). And, Figure I.D shows "Utilseconds"
Sensitivity Analysis (USSA) curves for a situation where CPU
hardware is upgraded, as a modification analysis. Utilseconds is
a term we coined, and refers to the way we use the ordinate axis
twice.


CUSA (Fig. I.A) can be used to estimate CPU utilization for future
workloads, assuming similar configurations to that of the modeling
period. In the figure, percent utilization is plotted against
hourly task volume for two different CPU models. A CICS CPU
threshold line is shown for the case where only one-half of the
dyadic hardware is "used" by CICS. Finally, a reference line is
placed at the peak load as it existed at the time of modeling:
70,000 tasks per hour. As can be seen from the graph, 3081-D
utilization at peak load is about 28 percent. Moving over to the
3081-K curve, at an equivalent percentage, the model shows a task
volume of about 100,000 per hour. This is a 40% improvement in
throughput, and follows directly from our estimates of the 3081-K
performance factor being 1.4 times a 3081-D.
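

The effect of the SPEED factor on the CUSA curves can be
approximated with the utilization law, U = X * D, where X is
throughput and D is CPU demand per task. The sketch below uses the
peak figures quoted above (70,000 tasks per hour at about 28
percent) and treats the processor as a single server with constant
demand per task. Both are simplifying assumptions of ours, and
this is not how BEST/1 itself computes the curves.

```python
# Sketch of CPU speed scaling under the utilization law (U = X * D).
# Constant CPU demand per task and a single-server view of the
# processor are simplifying assumptions.

def utilization(tasks_per_hour, cpu_seconds_per_task, speed=1.0):
    """Utilization law: throughput (tasks/sec) times demand, with
    demand shrinking in proportion to the processor speed factor."""
    return (tasks_per_hour / 3600.0) * (cpu_seconds_per_task / speed)


PEAK_TASKS_PER_HOUR = 70_000   # peak load at the time of modeling
PEAK_UTILIZATION = 0.28        # 3081-D utilization at that peak
SPEED_3081K = 1.4              # 3081-K performance factor relative to the 3081-D

# CPU seconds per task implied by the peak measurement on the 3081-D.
demand = PEAK_UTILIZATION * 3600.0 / PEAK_TASKS_PER_HOUR

# Task volume the faster processor carries at the same utilization.
equivalent_volume = PEAK_TASKS_PER_HOUR * SPEED_3081K

print(f"CPU demand per task: {demand * 1000:.1f} ms")
print(f"equal-utilization volume on 3081-K: {equivalent_volume:,.0f} tasks/hour")
print(f"3081-K utilization at the old peak: "
      f"{utilization(PEAK_TASKS_PER_HOUR, demand, SPEED_3081K):.0%}")
```

The equal-utilization volume comes out near 98,000 tasks per
hour, which matches the 40% throughput improvement read from the
graph.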


RTSA (Fig. I.B) is used to review response time estimates for future
workloads. Average response time is plotted against hourly task
volume for the two different CPU models. On the 3081-D, the current
peak load yields an average response time slightly higher than the
validation figure of 0.26 sec, because the validation task volume
was 62,000 tasks per hour. Note that a similar response time on the
3081-K projects out to about 100,000 tasks per hour. This is a 34%
improvement over the 3081-D. Remember that
the environment as it was modeled sat directly on the knee of the
response time curve, an unenviable position. There were some
questions as to whether the curve was accurate. So, data for five other
days has been placed on the 3081-D curve, where the task rates
differed from the 70,000 figure. These data fall on the RTSA curve,
and because of this it is believed that the modeled environment is
one where the daily peak load falls on the knee of the curve.
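

The shape of a response time curve and its knee can be seen with
the textbook single-server queueing formula R = S / (1 - U). The
sketch below is our illustration only. BEST/1 solves a full
queueing network with several servers and priority classes, and
the service time used here is hypothetical.

```python
# Sketch: why response time curves have a "knee". Uses the open
# single-server (M/M/1) residence time formula as a stand-in for the
# much richer queueing network model that BEST/1 solves.

def response_time(service_time, utilization):
    """Residence time at one server: R = S / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.10   # hypothetical seconds of service per task at the bottleneck

for u in (0.50, 0.70, 0.80, 0.90, 0.95):
    print(f"U={u:.2f}  R={response_time(SERVICE_TIME, u):.2f} sec")
```

Response time doubles between 50% and 75% utilization and doubles
again by 87.5%, so a system sitting on the knee is very sensitive
to small changes in load, and small measurement errors show up as
large response time errors.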


IOBSA (Fig. I.C) shows percent DASD pack utilization against the
hourly task volume. The two curves are for the modeling period's
two high use packs, both of which were IBM 3350's. Note that when
the task volume increases to the 100,000 per hour figure estimated
by the CUSA and RTSA curves for the 3081-K, the first I/O
bottleneck pack approaches 50% utilization. This may be a
situation in which further analysis of the I/O subsystem is
required. The modeling assumption which required a stable,
well-tuned environment may be invalid in this high use situation.
Possibly, dataset placement or redesign considerations should be
looked at when a situation such as this takes place. (At present,
the CICS application support group is looking into a redesign of
the I/O subsystem, to lower the usage of the high use packs.)


[Figure: CICS I/O Bottlenecks Sensitivity Analysis, BEST/1 modeling, April 1983]


Finally, USSA (Fig. I.D) places the CPU utilization and the
response time curves on a single graph. In this scenario, a CPU
modification to 2.5 times the 3081-D is made. And, because CICS
is best suited for uniprocessors, a uni is assumed, so that the
effective upgrade is 5.0 times one-half of the 3081-D.
"Utilseconds" is a new unit of measure, and it is plotted against
the hourly task volume. Utilseconds is actually two variables
rolled into one - 1) for the CPU utilization curve utilseconds
represents utilization, and 2) for the response time curve
utilseconds represents time in seconds. Utilseconds can be used
whenever the response time is less than one second. From Fig. I.D,
it can be seen that the knee of the R.T. curve is beyond 160,000
tasks per hour. And, R.T. up to that point is less than 0.15
sec/task. Unfortunately, our assumption about a well-tuned
environment will be invalid long before that point, and a
different I/O configuration or design will have to be used.


[Figure: CICS BEST/1 Sensitivity Analysis]


The IMS system is about one year old, is a CDS, and currently
has less than ten different transaction types. The IMS system is
involved in the mechanization of part of the business, and there
is considerable corporate conversion of offices to the online
system. Currently, there are about 1100 VDT's in the online
network. Over time, the IMS system will add many features, and
will triple the number of users. For example, the system which
was modeled lets users do inquiries only on the data base, whereas
later releases will have update capabilities. The IMS system ran
on a 24 Meg, 24 channel 3081-K, during May 1983. Dual logging was
done to tape, and 103 3350 DASD volumes were accessed during the
modeling period.


Here are the assumptions and risks which are associated with the 
modeling effort for the IMS system. 


Assumptions 


A. 1. The environment in which IMS ran was a
well-tuned and stable environment. In fact,
during the period from which modeling data was
gathered, the environment was stable.


A. 2. All transactions can be processed in all MPR's
(Message Processing Regions). Actually, if the
number of MPR's is N, then about 90% of the
transactions are processed in N-1 regions.
These are "Wait for Input" transactions. The
other 10% of the transactions are normal IMS
transactions, and are processed in the other
MPR. The IMS system runs in conversational mode.


A. 3. Once again, we did basic modeling - no I/O
subsystem modeling was done.


And, the following are the risks involved with modeling this IMS 
system. 


R. 1. Once again, we modeled a single busy period of
a single day.


R. 2. The DC Monitor interval, used to find
transaction types and volume, and the RMF/SMF
extracted interval were different. The RMF/SMF
interval was 30 minutes long, and the DC
Monitor interval started 10 minutes into the
RMF/SMF interval, and lasted for only 10
minutes.


R. 3. There were neither BMP's nor IMS batch running
during the 30 minute extraction interval. This
is considered "normal" operation during the
online day. But, the worst response time
occurs when the batch is running.


R. 4. The IMS control region and MPR's have different
performance groups. But, when the model was
built, they were grouped together into one IMS
workload. They cannot be effectively modeled
separately. This is a risk of no consequence.


R. 5. The CUTDEVICES facility of the CAPTURE/MVS
extractor was used. There were over one
hundred 3350 DASD volumes accessed by the
system of interest. Fifty of them had device
utilizations less than one percent. They were
ignored with the CUTDEVICES command. A Delay
server was added to compensate.


R. 6. Once again, high CPU unaccounted time was found
on this 3081-K. And, once again, the Overhead
workload got the bulk of that time.


R. 7. No log tape data appeared out of the Extractor
run, and there was dual logging being done.
This was because IMS bypasses the EXCP driver.
The service time was estimated and the tape
servers were put into the model.


R. 8. Four workloads were modeled: 1) IMS, 2) Test
Batch, 3) TSO, and 4) Overhead. The TSO
transaction rate was 440 trx/hour. Its CPU
utilization was less than one percent.


The phases of time involvement are explained more fully in the
CICS Modeling section of this report. The summary for the IMS
effort follows.


P. 1. Choose an application. Time involved:
insignificant.


P. 2. Establish contacts. Time involved: one day.
The IMS systems programming contacts had been
made previous to this effort.


P. 3. Gather preliminary data. Time involved: five
days.


P. 4. Run the data extractor. Time involved: one
day.


P. 5. Run the BEST/1 model and validate. Time
involved: three days.


P. 6. Run the sensitivity analyses. Time involved:
eight days. This application has a Service
Level Agreement. We ran quite a few cases of
modification analysis.


P. 7. Present the results. That includes this
report. The IMS section of this report took
about ten working days to complete. Generally,
written reports include all of the material
used for viewgraph presentations, and more.

The time involved totals twenty-eight working days, so the project
took about six weeks to complete. There was considerable contact
with the IMS Systems Programming Group throughout the study. This
was necessary because of our limited experience with IMS. Since
the IMS application is a new CDS, the company's insight into the
code and application may not be equal to that of the locally
developed CICS application. Consequently, we relied on the IMS
Systems Group.
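

The phase totals for the two efforts can be checked with a few
lines of arithmetic. The day counts below are the ones reported in
the phase lists. Phase 1 is given as insignificant for IMS and is
not legible for CICS in the source, so treating it as zero days in
both cases is an assumption.

```python
# Working days per phase as reported in the text. Phase 1 is taken
# as 0 days for both efforts (stated as insignificant for IMS,
# assumed for CICS).
cics_days = {"P1": 0, "P2": 3, "P3": 5, "P4": 2, "P5": 3, "P6": 4, "P7": 5}
ims_days = {"P1": 0, "P2": 1, "P3": 5, "P4": 1, "P5": 3, "P6": 8, "P7": 10}


def total_days(phases):
    """Total working days across all seven phases."""
    return sum(phases.values())


def sensitivity_share(phases):
    """Fraction of the non-presentation time (phases 1-6) spent on
    the sensitivity analyses (phase 6)."""
    modeling_days = total_days(phases) - phases["P7"]
    return phases["P6"] / modeling_days


for name, phases in (("CICS", cics_days), ("IMS", ims_days)):
    print(f"{name}: {total_days(phases)} days total, "
          f"{sensitivity_share(phases):.0%} of modeling time on sensitivity analyses")
```

The totals agree with the twenty-two and twenty-eight working days
quoted for the CICS and IMS efforts, and the CICS sensitivity
analyses come out just under one-quarter of the modeling time.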


IMS Model Validation. 

The validation of the IMS model was slightly different than that
for the CICS model. In CICS, one hour's extractor data was
validated against one hour's QCM data, one hour's PAII data, one
hour of RMF/SMF data, and an estimate of maximum task throughput
rate. For the IMS model there was one-half hour of extractor
data. And, it was validated against one-half hour of QCM and RMF
data, and 10 minutes of DC Monitor data. That is a three to one
ratio of RMF to DC Monitor intervals. Also, there was no prior
estimate of the maximum IMS transaction rate.


IMS validation was done on CPU utilization, verifying against RMF
and QCM data. Second, a validation was done against average
transaction residency time, comparing the model figure and the DC
Monitor figure. BEST/1 computes average transaction residency
time. The DC Monitor figure to compare against is the sum of the
mean scheduling and termination time, the mean schedule to first
DL/I call, and the average mean internal elapsed time per
transaction. The BEST/1 figure was 0.34 sec. The DC Monitor showed
0.38 sec. The difference is -10.5%. Considering the three-to-one
interval ratio, the R.T. difference is acceptable. The BEST/1
model figure for CPU utilization for the IMS application was 28%.
The QCM figure was 28%. The difference is zero because of the way
we distributed the CPU unaccounted time. Table 2 summarizes the
validation.


TABLE 2.


IMS Model Validation


Validation Variable       Data Source    Figure      Difference

IMS CPU Utilization       QCM            28%
                          Model          28%          0.0
System CPU Utilization    QCM            40%
                          Model          40%          0.0
Average Transaction       DC Monitor     0.38 sec
Residency Time            Model          0.34 sec    -10.5%
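

The Difference column appears to be the model figure relative to
the measured figure, which reproduces the -10.5% quoted for
residency time. A minimal sketch of that calculation, using the
IMS figures from the text (the function name is ours):

```python
# Sketch: validation difference as model relative to measurement.

def percent_difference(model, measured):
    """Signed difference of the model figure from the measured figure, in percent."""
    return 100.0 * (model - measured) / measured


rows = [
    # (validation variable, measured figure, model figure)
    ("IMS CPU utilization (%)", 28.0, 28.0),
    ("Average transaction residency time (sec)", 0.38, 0.34),
]

for label, measured, model in rows:
    print(f"{label}: {percent_difference(model, measured):+.1f}%")
```

A negative value means the model underestimates the measurement.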


After validation, the transaction throughput rate was increased,
so that the first bottleneck could be found, just as was done in
the CICS modeling exercise. Once again, assuming that proper
tuning on the I/O subsystem could be done, we could estimate the
first system bottleneck. In this case, it was memory. So, in the
sensitivity analyses which follow, memory will be varied by
increasing the MPL (multi-programming level) of the IMS
application.


Validating the IMS model proved to be an interesting task. There
were many iterations on the Analyzer, and each iteration produced
a more accurate model. The first change made to the basic model
was to use the CUTDEVICES command in the Analyzer step. Of the
103 DASD devices, 50 had utilizations of less than one percent.
CUTDEVICES was used to eliminate those devices. The total service
time for the 50 devices was grouped into a DELAY server. This
somewhat compensates for the removal of the devices. The DELAY
server had the second highest utilization, 29% less than the high
use pack. This produced a more accurate model than the basic
model.
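

The idea behind CUTDEVICES can be sketched as below: devices
under a utilization threshold are dropped as queueing servers, and
their service time is folded into one delay server, which adds
time to each transaction without any queueing. The device names,
figures, and data layout are hypothetical, and this is not the
CAPTURE/MVS implementation.

```python
# Sketch of the CUTDEVICES idea: drop lightly used devices and fold
# their service time into a single delay server. All device names and
# figures are hypothetical.

def cut_devices(devices, threshold=0.01):
    """Split devices into kept servers and one aggregate delay.

    devices maps a device name to a tuple of
    (utilization, service_seconds_per_transaction). Devices whose
    utilization is below the threshold are removed, and their
    per-transaction service times are summed into the returned delay.
    """
    kept = {name: d for name, d in devices.items() if d[0] >= threshold}
    delay = sum(d[1] for d in devices.values() if d[0] < threshold)
    return kept, delay


devices = {
    "PACK01": (0.220, 0.0300),
    "PACK02": (0.080, 0.0110),
    "PACK03": (0.004, 0.0006),
    "PACK04": (0.002, 0.0003),
}

kept, delay_seconds = cut_devices(devices)
print(f"kept servers: {sorted(kept)}")
print(f"delay server time per transaction: {delay_seconds * 1000:.1f} ms")
```

Queueing at a device that is busy less than one percent of the
time is negligible, so representing those devices as pure delay
loses little accuracy while making the model much smaller.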


Next, it was discovered that the mag tape devices were not being
picked up by the EXTRACTOR, so they were not found by the
ANALYZER, and subsequently not found in the basic model. Dual
logging was done, so two mag tape devices were put into the model
as servers. Those devices were validated on utilization figures
compared against RMF figures.


Finally, CPU unaccounted time was 11%. Half of this time was
distributed to the Overhead workload, and the workload CPU
utilizations were validated against QCM utilization figures. This
concluded the validation phase, and the sensitivity analyses
could then be performed.


IMS Modeling Results. 


The IMS modeling results are summarized in the seven graphs which
follow. As in the CICS modeling results, there is a CUSA graph
(Fig. II.A). There are two RTSA graphs (Figs. II.B and II.C).
Fig. II.B is a 95% RTSA, and Fig. II.C is a 90% RTSA. Unlike the
CICS application, the IMS application has a SLA. There are two
service level objectives for response time. The first is that in
the normal environment, 95% of all transactions will have an
internal response time of one second or less (Fig. II.B). The
second states that in degraded mode, where there is considerable
contention for resources because of an outage, 90% of all
transactions will have internal response times of three seconds
or less (Fig. II.C). RTSA graphs for the two SLA conditions with
two different processor configurations are shown in Fig. II.D and
Fig. II.E. Each figure has a curve for response time estimates
for a 3081-K and another for a 3084 estimate.


The IOBSA graph for the IMS model is shown in Fig. II.F. There
was only one DASD pack which showed significant utilization, so
only one curve is on the IOBSA graph. The high use DASD pack is
the spill area for the long message queue for IMS.


The memory sensitivity analysis (MSA) graph is shown in Fig. II.G.
This is a tri-variable graph, showing probability of overcommitment
of memory, the normalized memory queue, and the percent of
response time spent resident in the memory queue, all plotted
against hourly transaction volume. Two different multi-programming
levels - four and ten - are shown.


CUSA (Fig. II.A) shows that the CPU is not a problem. IMS runs
well on a dyadic processor. The transaction rate can be doubled
from the current 30,000 per hour, and still be less than 60%
utilization. A quad-complex, with the 3084, estimates 45%
utilization at 90,000 transactions per hour. But, other servers
will become bottlenecks long before the CPU.


The 95% RTSA (Fig. II.B) shows an interesting fact. The IMS
system, as configured, sits right above the SLA response time
threshold. The modeling period MPL was set at four. At this MPL
(or max-MPL), the system is sitting on the knee of the response
time curve. If the MPL were increased to ten, the horizontal part
of the curve would be slightly greater than one second, and the
knee of the curve would be at about 69,000 transactions per hour.
But, CSA limitations are imposed, and a MPL of ten cannot be
reached. A MPL of something slightly greater than four seems
feasible, however. Note that a MPL of greater than 10 shows
little payoff.


Since the data extracted for the modeling period was during a
normal operating condition, the 90% RTSA (Fig. II.C) is not a true
picture of the response time curves for the degraded mode. But,
it does give a picture of how much "response time slack" there is
in the system.

[Figure: IMS Response Time Sensitivity Analysis]


For a 3084 CPU estimate, Fig. II.D shows that the horizontal part
of the response time curve is only slightly lowered, to about one
second. This shows that a majority of the response time consists
of queueing time and/or data base access time. In fact,
before it hits the knee of the curve, the response time for this
IMS application is mostly data base access and wait time. Fig.
II.E shows the 90% RTSA "slack" one can expect for the degraded
mode.

IOBSA (Fig. II.F) shows that the high use pack utilization at a 
doubling of the current peak load to 60,000 transactions per hour 
is less than 45%. 

[Figure: IMS System Pack Utilization Sensitivity Analysis. BEST/1 modeling, May 1983] 

Finally, Fig. II.G shows the memory sensitivity analysis curve 
(MSA). MSA shows, for the IMS workload, three different variables 
graphed for two different MPLs, against hourly transaction 
volume. We see that all three variables represent the same 
characteristic - memory utilization. For both MPLs, four and ten, 
the following are graphed: 1) the probability of overcommitment of 
memory, 2) the normalized memory queue, and 3) the residency time 
percentage spent in the input memory queue. The probability of 
overcommitment of memory is the approximate probability that there 
is at least one transaction on the input memory queue. The 
normalized memory queue figure is the average number of 
transactions on the memory queue divided by the MPL. 
The residency time percentage spent in the memory queue plus the 
"in and processing or doing IWAITs" percentage equals 100% of the 
residency time. Note that, for each of MPL=4 and MPL=10, all three 
lines lie in the same band, and that the two bands are separate. 
Note that the current (MPL=4) peak load of 20,000 transactions per 
hour shows a memory figure of 0.3 (Fig. II.G). Projection to the 
MPL=10 band shows a throughput of 65,000 transactions per hour. 
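
The three memory figures defined above can be written out as a short Python sketch. It assumes, purely for illustration, that one has a series of sampled input memory queue lengths and the average time a transaction spends queued versus in memory. The function name, argument names, and sample values are all hypothetical; the values graphed in the paper came from the BEST/1 model, not from a calculation like this.

```python
from statistics import mean


def memory_metrics(queue_samples, mpl, queue_time_sec, active_time_sec):
    """Return the three memory figures described in the text.

    queue_samples   -- sampled lengths of the input memory queue (assumed input)
    mpl             -- multiprogramming level in effect
    queue_time_sec  -- average residency time spent on the memory queue
    active_time_sec -- average residency time spent in memory
                       (processing or doing IWAITs)
    """
    # 1) Approximate probability that at least one transaction is queued.
    p_overcommit = sum(1 for q in queue_samples if q >= 1) / len(queue_samples)
    # 2) Average queue length divided by the MPL.
    normalized_queue = mean(queue_samples) / mpl
    # 3) Share of residency time spent queued; the in-memory share is the rest.
    pct_in_queue = 100.0 * queue_time_sec / (queue_time_sec + active_time_sec)
    return p_overcommit, normalized_queue, pct_in_queue


# Hypothetical samples for an MPL of four.
p, nq, pct = memory_metrics([0, 0, 1, 2, 0, 3, 0, 0], mpl=4,
                            queue_time_sec=0.5, active_time_sec=1.5)
print(p, nq, pct, 100.0 - pct)
```

All three figures rise together as the memory queue builds, which is consistent with the observation that they represent the same characteristic.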

Using the CPU utilization, response time analysis, pack 
utilization, and memory analysis graphs, we can make projections 
about the resources needed to process various transaction volumes. 
This IMS application, as modeled, can safely process over 50,000 
transactions per hour, an increase of 2/3 over the current peak 
load, if the maximum MPL can be increased above four. 
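
The arithmetic behind the "increase of 2/3" can be checked with a few lines of Python. The 30,000 per hour starting point is taken from the earlier statement that doubling the current peak load gives 60,000; the function names are illustrative only.

```python
from fractions import Fraction


def projected_volume(current_per_hour, fractional_increase):
    """Volume after growing the current rate by the given fraction."""
    return current_per_hour * (1 + fractional_increase)


def required_increase(current_per_hour, target_per_hour):
    """Fractional growth needed to move from the current to the target rate."""
    return Fraction(target_per_hour - current_per_hour, current_per_hour)


current_peak = 30_000  # implied by "doubling of the current peak load to 60,000"
print(projected_volume(current_peak, Fraction(2, 3)))
print(required_increase(current_peak, 50_000))
```

Exact fractions are used so that the 2/3 figure is reproduced without rounding.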


SUMMARY 


As has been shown, a modeling exercise of a large CICS or IMS 
application can be valuable experience for the large system 
capacity planner. Deep insight into the application is a natural 
byproduct of the modeling experience. In addition, the analyst 
is given the satisfaction of knowing what are the important 
parameters and variables used to describe the large system. From 
future workload projections, estimates of key resource consumption 
can be made. 

Foremost in the analyst's repertoire is the understanding of the 
assumptions and risks associated with modeling the application and 
environment of interest. For all modeling efforts, the most 
important assumption the analyst makes is that the environment in 
which the application ran was well-tuned and stable during the 
data extraction measurement interval. Next, with large systems 
the analyst may need to assume that all transactions require the 
same resources, and the same service requirements from all servers. 
To do otherwise would impose heavy constraints on the analyst's 
time. In fact, with dynamic systems it may be impossible to break 
out the service requirements of each individual transaction. A 
major risk is that the transaction mix will change significantly. 
In that case, the service requirements will undoubtedly change, 
and a different model will need to be formed. As we have seen, 
there are many assumptions and risks associated with any modeling 
effort, and it is imperative that the analyst understand them. 


In the CICS experience outlined in this study, there were a number 
of points made. First, for a large system, with interfaces to other 
systems, it may help the analyst to break out different clusters 
of task types, such that tasks can be scheduled on a priority basis. 
We found that the split between short and long wait time tasks was 
valuable in this modeling exercise. The short wait time tasks had 
a higher priority than the long ones. Next, the dyadic processor 
was split into two servers, one which ran CICS, and one which ran 
all other tasks. This allows the analyst to model an environment 
where total CPU utilization on the dyadic will be greater than 50%. 


In the IMS large system study, we also found that a knowledge of 
the transaction types was valuable. The Wait for Input 
transactions dominate this application. We saw that the DC 
Monitor and the RMF monitor were not in sync, so the validation of 
the model was difficult. Later, we found that the log tape 
analysis utility can be of some use to the analyst, although we 
did not use it. In summary, the analyst must know which tools to 
use at various stages of the modeling process. 


For large systems, it appears that the CUTDEVICES command of 
CAPTURE/MVS can be used to eliminate low utilization servers from 
the model. Generally, it is recommended that the analyst use any 
tool which simplifies the modeling requirements, without destroying 
the integrity of the model. CUTDEVICES is one such tool. 
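
The idea of dropping low utilization servers from a model can be sketched in Python as a simple threshold filter. This is not CAPTURE/MVS syntax and makes no claim about how CUTDEVICES works internally; the function name, device names, utilizations, and the 1% threshold are all hypothetical.

```python
def cut_low_utilization(device_util, threshold):
    """Split devices into those kept in the model and those dropped from it."""
    kept = {name: u for name, u in device_util.items() if u >= threshold}
    dropped = {name: u for name, u in device_util.items() if u < threshold}
    return kept, dropped


# Hypothetical measured pack utilizations (fractions of busy time).
devices = {"PACK01": 0.42, "PACK02": 0.18, "PACK03": 0.004, "PACK04": 0.001}
kept, dropped = cut_low_utilization(devices, threshold=0.01)
print("modeled:", sorted(kept))
print("dropped:", sorted(dropped))
print("utilization removed from the model:", sum(dropped.values()))
```

Reporting the total utilization removed gives the analyst a quick check that the simplification has not damaged the integrity of the model.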


For capacity planners, the modeling experience provides an excellent 
exposure to the application variables which affect system 
performance. The analyst will find that he or she gains a broad 
knowledge of the application as it will perform under various 
workload and hardware scenarios. This is certainly a valuable 
understanding, and the analyst will benefit from it when projecting 
the resource requirements of the application. When making such 
projections, the analyst will also have a list of assumptions and 
limits which are applied against any sensitivity analyses. Armed 
with such a list, the analyst is ready to present the results of 
the study to management. 


We have found that the best form of presentation for management is 
graphic in nature. Generally, graphs present considerable 
information in a concise manner. Tabular reports have been 
aggressively avoided. We believe this approach offers superior 
final products, with higher degrees of acceptability to 
management. The basic graph for management presents some dependent 
variable, such as utilization or response time, graphed against 
transaction rate. With many dependent variables to be graphed, 
management will find some continuity in being able to find the 
faithful transaction rate on the independent axis. Find what 
suits your immediate management, and use it to your advantage. 


Abstract: The Research Queueing Package (RESQ) is a tool for constructing 
and solving models of contention systems. A contention system is a 
collection of interconnected resources and jobs which demand service from 
these resources. Examples of contention systems are computer systems, 
communication networks, manufacturing systems, office systems and 
distributed systems. We first illustrate the basic facilities available in 
RESQ for representing such systems and provide a simple example in order 
to illustrate their use. 


Next we describe how RESQ has been used as an analysis tool to assist 
in the development of the disk cache portion of the IBM 4967 disk control 
unit for the IBM Series/1 computer system. The discussion here has wider 
application because the same design problems considered for the 4967 will 
also occur in one form or another in disk controllers connected to systems 
ranging in size from the Personal Computer to the top of the line MVS and 
VM systems. Also, programming design has a need similar to hardware 
design as to modeling and understanding sequence relationships and 
overlap in a complex system with many process steps. Based on such modeling 
experience, it is the authors' opinion that the RESQ approach involving a 
network of queues and the facility of passive queues is very well suited 
for investigation of many design issues associated with development of 
hardware as well as both operating systems and applications programming. 